BACKGROUND: Reported declines in sperm counts remain controversial today and recent trends are unknown. A definitive
meta-analysis is critical given the predictive value of sperm count for fertility, morbidity and mortality.


OBJECTIVE AND RATIONALE: To provide a systematic review and meta-regression analysis of recent trends in sperm counts as 
measured by sperm concentration (SC) and total sperm count (TSC), and their modification by fertility and geographic group. 


SEARCH METHODS: PubMed/MEDLINE and EMBASE were searched for English language studies of human SC published in
1981-2013. Following a predefined protocol, 7518 abstracts were screened and 2510 full articles reporting primary data on SC were
reviewed. A total of 244 estimates of SC and TSC from 185 studies of 42935 men who provided semen samples in 1973-2011 were
extracted for meta-regression analysis, as well as information on years of sample collection and covariates [fertility group (‘Unselected by
fertility’ versus ‘Fertile’), geographic group (‘Western’, including North America, Europe, Australia and New Zealand versus ‘Other’,
including South America, Asia and Africa), age, ejaculation abstinence time, semen collection method, method of measuring SC and semen
volume, exclusion criteria and indicators of completeness of covariate data]. The slopes of SC and TSC were estimated as functions of
sample collection year using both simple linear regression and weighted meta-regression models, and the latter were adjusted for
predetermined covariates and modification by fertility and geographic group. Assumptions were examined using multiple sensitivity analyses
and nonlinear models.


OUTCOMES: SC declined significantly between 1973 and 2011 (slope in unadjusted simple regression models = -0.70 million/ml/year;
95% CI: -0.72 to -0.69; P < 0.001; slope in adjusted meta-regression models = -0.64; -1.06 to -0.22; P = 0.003). The slopes in the
meta-regression model were modified by fertility (P for interaction = 0.064) and geographic group (P for interaction = 0.027). There was a
significant decline in SC between 1973 and 2011 among Unselected Western (-1.38; -2.02 to -0.74; P < 0.001) and among Fertile
Western (-0.68; -1.31 to -0.05; P = 0.033), while no significant trends were seen among Unselected Other and Fertile Other. Among
Unselected Western studies, the mean SC declined, on average, 1.4% per year with an overall decline of 52.4% between 1973 and 2011.
Trends for TSC and SC were similar, with a steep decline among Unselected Western (-5.33 million/year, -7.56 to -3.11; P < 0.001),
corresponding to an average decline in mean TSC of 1.6% per year and overall decline of 59.3%. Results changed minimally in multiple
sensitivity analyses, and there was no statistical support for the use of a nonlinear model. In a model restricted to data post-1995, the slope
both for SC and TSC among Unselected Western was similar to that for the entire period (-2.06 million/ml, -3.38 to -0.74; P = 0.004
and -8.12 million, -13.73 to -2.51, P = 0.006, respectively).


WIDER IMPLICATIONS: This comprehensive meta-regression analysis reports a significant decline in sperm counts (as measured by 
SC and TSC) between 1973 and 2011, driven by a 50-60% decline among men unselected by fertility from North America, Europe, 
Australia and New Zealand. Because of the significant public health implications of these results, research on the causes of this continuing 
decline is urgently needed. 
Introduction


Have sperm counts declined? This question remains as controversial 
today as in 1992 when Carlsen et al. (1992) wrote that ‘There has
been a genuine decline in semen quality over the past 50 years’. This
controversy has continued unabated both because of the importance 
of the question and limitations in studies that have attempted to 
address it (Swan et al., 2000; Safe, 2013; Te Velde and Bonde, 2013). 

Sperm count is of considerable public health importance for sev- 
eral reasons. First, sperm count is closely linked to male fecundity 
and is a crucial component of semen analysis, the first step to identify 
male factor infertility (World Health Organization, 2010; Wang and 
Swerdloff, 2014). The economic and societal burden of male infertility 
is high and increasing (Winters and Walsh, 2014; Hauser et al., 2015; 
Skakkebaek et al., 2016). Second, reduced sperm count predicts 
increased all-cause mortality and morbidity (Jensen et al., 2009; 
Eisenberg et al., 2014b, 2016). Third, reduced sperm count is asso- 
ciated with cryptorchidism, hypospadias and testicular cancer, sug- 
gesting a shared prenatal etiology (Skakkebaek et al., 2016). Fourth, 
sperm count and other semen parameters have been plausibly asso- 
ciated with multiple environmental influences, including endocrine dis- 
rupting chemicals (Bloom et al., 2015; Gore et al., 2015), pesticides 
(Chiu et al., 2016), heat (Zhang et al., 2015) and lifestyle factors, 
including diet (Afeiche et al., 2013; Jensen et al., 2013), stress 
(Gollenberg et al., 2010; Nordkap et al., 2016), smoking (Sharma 
et al., 2016) and BMI (Sermondade et al., 2013; Eisenberg et al.,
2014a). Therefore, sperm count may sensitively reflect the impacts of
the modern environment on male health throughout the life course
(Nordkap et al., 2012).

Given this background, we conducted a rigorous and complete sys- 
tematic review and meta-regression analysis of recent trends in sperm 
count as measured by sperm concentration (SC) and total sperm 
count (TSC), and their modification by fertility and geographic group. 


Methods 


This systematic review and meta-regression analysis was conducted and 
the results reported in accordance with MOOSE (Meta-analysis in 
Observational Studies in Epidemiology) (Stroup et al., 2000) and PRISMA 
(Preferred Reporting Items for Systematic reviews and Meta-Analysis) 
guidelines (Liberati et al., 2009; Moher et al., 2009) [checklists available 
upon request—contact corresponding author for access]. Our research 
team included epidemiologists, andrologists and a qualified medical librar- 
ian, with consultation with an expert in meta-analysis. Our predefined 
protocol, detailed in Supplementary Information, was developed following 
best practices (Borenstein et al., 2009; Higgins and Green, 2011; Program 
NT, 2015), and informed by two pilot studies, the first using all 1996 
publications and the second all 1981 and 2013 publications. 


Systematic review 


The goal of the search was to identify all articles that reported primary data 
on human sperm count. We searched MEDLINE on November 21, 2014
and Embase (Excerpta Medica database) on December 10, 2014 for
peer-reviewed, English-language publications. Following the recommendation of
the Cochrane Handbook for Systematic Reviews, we searched in title and
abstract for both index (MeSH) terms and keywords and filtered out
animal-only studies. We used the MeSH term ‘sperm count’, which includes
seven additional terms, and to increase sensitivity we added 13 related
keywords (e.g. ‘sperm density’ and ‘sperm concentration’). We included all
publications between January 1, 1981 (the first full year after the term
‘Sperm Count’ was added to MEDLINE as a MeSH term) and December
31, 2013 (the last full year at the time we began our MEDLINE search).

All studies that reported primary data on human SC were considered 
eligible for abstract screening. We evaluated the eligibility of all subgroups 
within a study. For example, in a case-control study, the control group 
might have been eligible for inclusion even though, based on our exclu- 
sion criteria, the case group was not. 

We divided eligible studies into two fertility-defined groups: men unse- 
lected by fertility status, hereafter ‘Unselected’ (e.g. young men unlikely 
to be aware of their fertility such as young men screened for military ser- 
vice or college students); and fertile men, hereafter ‘Fertile’ (e.g. men 
who were known to have conceived a pregnancy, such as fathers or part- 
ners of pregnant women regardless of pregnancy outcome). 

A study was excluded if study participants were selected based on: 
infertility or sub-fertility; range of semen parameters (e.g. studies selecting 
normospermic men); genital abnormalities, other diseases or medication. 
We also excluded studies limited to men with exposures that may affect 
fertility such as occupational exposure, post-intervention or smoking. 
Studies of candidates for vasectomy or semen donation were included 
only if semen quality was not a criterion for men’s study participation. 
Studies with fewer than 10 men and those that used non-standard meth- 
ods to collect or count sperm (e.g. methods other than masturbation for 
collection, or methods other than hemocytometer for counting) were 
also excluded. 

First, based on the title and abstract the publication was either 
excluded or advanced to full text screening. Any publication without an 
abstract was automatically referred for full text screening. Second, we 
reviewed the full text and assigned it to exclusion within a specific cat- 
egory, or data extraction. We then confirmed study eligibility and identi- 
fied multiple publications from the same study to ensure that estimates 
from the same population were not used more than once. 


Data extraction 


We extracted summary statistics on SC and TSC (mean, SD, SE, minimum, 
maximum, median, geometric mean and percentiles), mean or additional 
data on semen volume, sample size (for SC and for TSC), sample collection
years and covariates: fertility group, country, age, ejaculation abstinence
time, methods of semen collection, methods of assessing SC and
semen volume, selection of population and study exclusion criteria as
well as number of samples per man. The range of permissible values, 
both for categorical and numerical variables, and information on data 
completeness were recorded. Data were extracted on all eligible sub- 
groups separately as well as for the total population, if relevant. We 
attempted to extract data on additional potential confounders such as 
BMI, smoking and other lifestyle factors (e.g. alcohol and stress). 
However, except for smoking (which was examined in sensitivity ana- 
lysis), data were available for such variables in only a minority of studies 
so these were not included in meta-regression analyses. 


Quality control 


The study was conducted following a predefined protocol (Supplementary 
Information). Screening for this extensive systematic review was conducted 
by a team of eight reviewers (H.L., N.J., A.M.A., J.M., D.W.D., I.M., J.D.M., 
S.H.S.). The screening protocol was piloted by screening of 50 abstracts by
all reviewers followed by a comparison of results, resolution of any
inconsistencies and clarification of the protocol as needed. The same quality
control process was followed for full text screening (35 studies reviewed 
by all reviewers) and data extraction (data extracted from three studies by 
all reviewers). All data were entered into digital spreadsheets with explicit 
permissible values (no open-ended entries) to increase consistency. After 
data extraction, an additional round of data editing and quality control of 
all studies was conducted by H.L. The process ensured that each study 
was evaluated by at least two different reviewers. 


Statistical analysis 


We used point estimates of mean SC or mean TSC from individual stud- 
ies to model time trends during the study period, as measured by slope 
of SC or TSC per calendar year. The midpoint of the sample collection 
period was the independent variable in all analyses. Units were million/ml 
for SC and million for TSC (defined as SC × sample volume) and all
slopes denote unit change per calendar year. 

We first used simple linear regression models to estimate SC and TSC 
as functions of year of sample collection, with each study weighted by 
sample size. We then used random-effects meta-regression to model 
both SC and TSC as linear functions of time, weighting studies by the SE. 
In all meta-regression analyses, we included indicator variables to denote 
studies with more than one SC estimate. We controlled for a pre- 
determined set of potential confounders: fertility group, geographic 
group, age, abstinence time, whether semen collection and counting 
methods were reported, number of samples per man and indicators for 
exclusion criteria (Supplementary Table S1). 
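
The role of the weights in these models can be illustrated with a minimal sketch. The Python below is not the STATA code used for this analysis. It fits a weighted least-squares line through study-level means, once with sample-size weights (as in the simple linear models) and once with inverse-variance weights 1/SE² (an assumed, conventional reading of weighting by the SE). It leaves out the between-study variance component of a random-effects meta-regression and all covariates, and every data value in it is hypothetical.

```python
def weighted_slope(years, means, weights):
    """Weighted least-squares fit of study means on year.

    Returns (intercept, slope); slope is the unit change per calendar year.
    """
    w_total = sum(weights)
    x_bar = sum(w * x for w, x in zip(weights, years)) / w_total
    y_bar = sum(w * y for w, y in zip(weights, means)) / w_total
    s_xy = sum(w * (x - x_bar) * (y - y_bar)
               for w, x, y in zip(weights, years, means))
    s_xx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, years))
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


# Hypothetical study-level estimates (not taken from the 185 included studies).
midpoint_years = [1975, 1985, 1995, 2005, 2010]   # midpoint of sample collection
mean_sc = [98.0, 85.0, 84.0, 70.0, 66.0]          # mean SC, million/ml
sample_sizes = [40, 250, 120, 300, 90]            # men per estimate
standard_errors = [6.0, 2.5, 3.5, 2.0, 4.0]       # SE of each mean SC

# Simple linear model: each estimate weighted by its sample size.
_, slope_by_n = weighted_slope(midpoint_years, mean_sc, sample_sizes)

# Inverse-variance weighting: precise estimates count more (assumed 1/SE^2).
inverse_variance = [1.0 / se ** 2 for se in standard_errors]
_, slope_by_se = weighted_slope(midpoint_years, mean_sc, inverse_variance)

print(f"slope, sample-size weights:      {slope_by_n:.2f} million/ml/year")
print(f"slope, inverse-variance weights: {slope_by_se:.2f} million/ml/year")
```

The two weighting schemes generally give different slopes when large studies are not also the most precise ones, which is one reason both sets of models are reported.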

For several key variables missing values were estimated and a variable 
was included in meta-regression analyses to denote that the value had 
been estimated. For example, for studies that reported median (not mean) 
SC or TSC, we estimated the mean by adding the average difference 
between the mean and median in studies for which both were reported. 
For studies that did not report the range or midpoint year of sample col- 
lection, the midpoint was estimated by subtracting the average difference 
between year of publication and midpoint year of sample collection in stud- 
ies for which both were reported from publication year. When SD but not 
SE of SC or TSC were reported, the SE was calculated by dividing the SD 
by the square root of sample size for each estimate. For studies that did 
not report SD or SE, we estimated SE by dividing the mean SD of studies 
that reported SD by the square root of sample size for this estimate. If 
mean TSC was not reported it was calculated by multiplying mean SC by 
mean semen volume (Supplementary Information). 
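
The estimation rules in this paragraph amount to a few lines of arithmetic. The sketch below restates them in Python under stated assumptions: the function names are hypothetical, the inputs in the usage lines are invented, and the "average difference" rules are implemented as simple unweighted averages, which the text does not specify.

```python
import math


def se_from_sd(sd, n):
    """SE of a mean when the study reports SD and sample size."""
    return sd / math.sqrt(n)


def impute_se(reported_sds, n):
    """SE for a study reporting neither SD nor SE: mean SD of other studies / sqrt(n)."""
    mean_sd = sum(reported_sds) / len(reported_sds)
    return mean_sd / math.sqrt(n)


def mean_from_median(median, mean_median_pairs):
    """Estimate a mean by adding the average (mean - median) gap seen elsewhere."""
    gaps = [mean - med for mean, med in mean_median_pairs]
    return median + sum(gaps) / len(gaps)


def midpoint_from_publication(pub_year, pub_midpoint_pairs):
    """Estimate the collection midpoint by subtracting the average publication lag."""
    lags = [pub - mid for pub, mid in pub_midpoint_pairs]
    return pub_year - sum(lags) / len(lags)


def tsc_from_sc(mean_sc, mean_volume):
    """Mean TSC (million) = mean SC (million/ml) x mean semen volume (ml)."""
    return mean_sc * mean_volume


# Hypothetical inputs, for illustration only.
print(se_from_sd(40.0, 100))                                      # SD 40, n = 100
print(impute_se([30.0, 50.0], 25))                                # mean SD 40, n = 25
print(mean_from_median(60.0, [(80.0, 70.0), (66.0, 60.0)]))       # average gap of 8
print(midpoint_from_publication(2000, [(1995, 1992), (2005, 2000)]))
print(tsc_from_sc(60.0, 3.0))
```

Each imputed value would also carry an indicator variable in the model, as described above, so that estimated and reported values can be distinguished.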

Our final analyses included two groups of countries. One group 
(referred to here as ‘Western’) includes studies from North America, 
Europe, Australia and New Zealand. The second (‘Other’/‘Non- 
Western’) includes studies from all other countries (from South America, 
Asia and Africa). We initially examined studies from North America sep- 
arately from Europe/Australia but combined these because trends were 
similar and only 16% of estimates were from North America. We 
assessed modification of slope by fertility group (Unselected versus 
Fertile) and geographic group (Western versus Other). Because of signifi- 
cant modification by fertility and geography, results of models with inter- 
action terms are presented for four categories: Unselected Western; 
Fertile Western; Unselected Other; and Fertile Other. Overall percent- 
age declines were calculated by estimating the sperm count (SC or TSC) 
in the first and last year of data collection, and dividing the difference by 
the estimate in the first year. The percentage decline per year was calcu- 
lated by dividing the overall percentage declines by the number of years. 
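
As a worked example of this calculation, the sketch below applies it to the model-based estimates reported in the Results for the Unselected Western group (SC of 99.0 and 47.1 million/ml, TSC of 337.5 and 137.5 million, in 1973 and 2011). The function names are hypothetical, and taking the number of years as last year minus first year (38) is an assumption.

```python
def overall_decline_pct(first_value, last_value):
    """Overall decline as a percentage of the first-year estimate."""
    return (first_value - last_value) / first_value * 100.0


def decline_pct_per_year(first_value, last_value, first_year, last_year):
    """Overall percentage decline divided by the number of years (assumed last - first)."""
    n_years = last_year - first_year
    return overall_decline_pct(first_value, last_value) / n_years


# Unselected Western estimates from the fully adjusted models.
sc_overall = overall_decline_pct(99.0, 47.1)
sc_per_year = decline_pct_per_year(99.0, 47.1, 1973, 2011)
tsc_overall = overall_decline_pct(337.5, 137.5)
tsc_per_year = decline_pct_per_year(337.5, 137.5, 1973, 2011)

print(f"SC:  {sc_overall:.1f}% overall, {sc_per_year:.1f}% per year")
print(f"TSC: {tsc_overall:.1f}% overall, {tsc_per_year:.1f}% per year")
```

Rounded to one decimal place, these reproduce the reported overall declines of 52.4% and 59.3% and the annual declines of 1.4% and 1.6%.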

We ran all analyses for TSC weighting by SE of TSC and adjusted for 
method used to assess semen volume: weighing, read from pipette, read 
from tube or other. 


We conducted several sensitivity analyses: adding cubic and quadratic
terms for year of sample collection in meta-regression analyses to assess 
non-linearity; excluding a specific group for each covariate, such as a 
group with incomplete information; removing covariates one at a time 
from the model; removing studies with SEs > 20 million/ml; replacing age 
group by mean age, excluding studies that did not report mean age; add- 
ing covariate for high smoking prevalence (>30%); excluding countries 
that contributed the greatest number of estimates in order to examine 
the influence of these countries; restricting analyses to studies with data 
collected after 1985 and after 1995 to examine recent trends. 

All analyses were conducted using STATA version 14.1 (StataCorp, 
TX, USA). A value of P < 0.05 was considered significant for main effect 
and P < 0.10 for interaction. 


Results 


Systematic review and summary statistics 


Using PubMed and Embase searches we identified 7518 publications
meeting our criteria for abstract screening (Fig. 1). Of these, 14 duplicate
records were removed and 4994 were excluded based on title
or abstract screening. Full texts of the remaining 2510 articles were 
reviewed for eligibility and 2179 studies were excluded. Of the 
remaining 331 articles, 146 were excluded during data extraction and 
the second round of full text screening (mainly due to multiple publi- 
cations). The meta-regression analysis is based on the remaining 185 
studies, which included 244 unique mean SC estimates based on 
samples collected between 1973 and 2011 from 42935 men. Data 
were available from 6 continents and 50 countries. The mean SC was 
81 million/ml, the mean TSC was 260 million and the mean year of 
data sample collection was 1995. Of the 244 estimates, 110 (45%) 
were Unselected Western, 65 (27%) Fertile Western, 30 (12%) 
Unselected Other and 39 (16%) Fertile Other. Data from the 185 
publications included in the meta-analysis are available upon request— 
contact corresponding author for access (Abyholm, 1981; Fariss 
et al., 1981; Leto and Frensilli, 1981; Wyrobek et al., 1981a,b; 
Aitken et al., 1982; Nieschlag et al., 1982; Obwaka et al., 1982; 
Albertsen et al., 1983; Fowler and Mariano, 1983; Sultan Sheriff, 
1983; Wickings et al., 1983; Asch et al., 1984; de Castro and 
Mastrorocco, 1984; Fredricsson and Sennerstam, 1984; Freischem 
et al., 1984; Ward et al., 1984; Ayers et al., 1985; Heussner et al.,
1985; Rosenberg et al., 1985; Aribarg et al., 1986; Comhaire et al.,
1987; Kirei, 1987; Giblin et al., 1988; Kjaergaard et al., 1988; 
Mieusset et al., 1988, 1995; Jockenhovel et al., 1989; Sobowale and 
Akiwumi, 1989; Svanborg et al., 1989; Zhong et al., 1990; Culasso 
et al., 1991; Dunphy et al., 1991; Gottlieb et al., 1991; Nnatu et al.,
1991; Pangkahila, 1991; Weidner et al., 1991; Levine et al., 1992; 
Sheriff and Legnain, 1992; Ali et al., 1993; Arce et al., 1993; Bartoov 
et al., 1993; Fedder et al., 1993; Noack-Fuller et al., 1993; World 
Health Organization, 1993; Hill et al., 1994; Rehan, 1994; Rendon et al., 
1994; Taneja et al., 1994; Vanhoorne et al., 1994; Auger et al., 1995; 
Cottell and Harrison, 1995; Figa-Talamanca et al., 1996; Fisch et al.,
1996; Irvine et al., 1996; Van Waeleghem et al., 1996; Vierula et al.,
1996; Vine et al., 1996; Auger and Jouannet, 1997; Jensen et al.,
1997; Lemcke et al., 1997; Handelsman, 1997a,b; Chia et al., 1998; 
Muller et al., 1998; Naz et al., 1998; Gyllenborg et al., 1999; Kolstad 
et al., 1999; Kuroki et al., 1999; Larsen et al., 1999; Purakayastha 
et al., 1999; Reddy and Bordekar, 1999; De Celis et al., 2000; 
Glazier et al., 2000; Mak et al., 2000; Selevan et al., 2000; Wiltshire
et al., 2000; Zhang et al., 2000; Foppiani et al., 2001; Guzick et al.,
2001; Hammadeh et al., 2001; Jorgensen et al., 2001, 2002, 2011, 
2012; Kelleher et al., 2001; Lee and Coughlin, 2001; Patankar et al., 
2001; Tambe et al., 2001; Xiao et al., 2001; Costello et al., 2002; 
Junging et al., 2002; Kukuvitis et al., 2002; Luetjens et al., 2002; 
Punab et al., 2002; Richthoff et al., 2002; Danadevi et al., 2003; de 
Gouveia Brazao et al., 2003; Firman et al., 2003; Liu et al., 2003; 
Lundwall et al., 2003; Roste et al., 2003; Serra-Majem et al., 2003; 
Uhler et al., 2003; Xu et al., 2003; Ebesunun et al., 2004; Rintala 
et al., 2004; Toft et al., 2004, 2005; Bang et al., 2005; Mahmoud 
et al., 2005; Muthusami and Chinnaswamy, 2005; O’Donovan, 2005; 
Tsarev et al., 2005, 2009; Durazzo et al., 2006; Fetic et al., 2006; 
Giagulli and Carbone, 2006; Haugen et al., 2006; Iwamoto et al.,
2006, 2013a,b; Pal et al., 2006; Yucra et al., 2006; Aneck-Hahn 
et al., 2007; Garcia et al., 2007; Multigner et al., 2007; Plastira et al., 
2007; Rignell-Hydbom et al., 2007; Wu et al., 2007; Akutsu et al., 
2008; Bhattacharya, 2008; Gallegos et al., 2008; Goulis et al., 2008; 
Jedrzejczak et al., 2008; Kobayashi et al., 2008; Korrovits et al.,
2008; Li and Gu, 2008; Lopez-Teijon et al., 2008; Paasch et al., 
2008; Peters et al., 2008; Recabarren et al., 2008; Recio-Vega et al., 
2008; Saxena et al., 2008; Shine et al., 2008; Andrade-Rocha, 2009; 
Kumar et al., 2009, 2011; Rylander et al., 2009; Stewart et al., 2009; 
Vani et al., 2009, 2012; Verit et al., 2009; Engelbertz et al., 2010; 
Hossain et al., 2010; Ortiz et al., 2010; Rubes et al., 2010; Tirumala 
Vani et al., 2010; Al Momani et al., 2011; Auger and Eustache, 2011;
Axelsson et al., 2011; Brahem et al., 2011; Jacobsen et al., 2011; 
Khan et al., 2011; Linschooten et al., 2011; Venkatesh et al., 2011;
Vested et al., 2011; Absalan et al., 2012; Al-Janabi et al., 2012; 
Katukam et al., 2012; Mostafa et al., 2012; Nikoobakht et al., 2012; 
Rabelo-Junior et al., 2012; Splingart et al., 2012; Bujan et al., 2013; 
Girela et al., 2013; Halling et al., 2013; Ji et al., 2013; Mendiola et al.,
2013; Redmon et al., 2013; Thilagavathi et al., 2013; Valsa et al., 
2013; Zalata et al., 2013; Zareba et al., 2013; Huang et al., 2014). 
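
The counts in the selection flow and the group shares quoted above can be checked with simple arithmetic. The short sketch below only restates the numbers given in this section; the variable names are hypothetical.

```python
# Counts reported in the text for each stage of study selection.
identified = 7518
duplicates_removed = 14
excluded_on_title_or_abstract = 4994
excluded_on_full_text = 2179
excluded_at_extraction = 146

full_texts_reviewed = identified - duplicates_removed - excluded_on_title_or_abstract
articles_after_full_text = full_texts_reviewed - excluded_on_full_text
studies_included = articles_after_full_text - excluded_at_extraction

# Number of SC estimates in each fertility-by-geography group.
estimates_by_group = {
    "Unselected Western": 110,
    "Fertile Western": 65,
    "Unselected Other": 30,
    "Fertile Other": 39,
}
total_estimates = sum(estimates_by_group.values())
share_pct = {
    group: round(100 * n / total_estimates)
    for group, n in estimates_by_group.items()
}

print(full_texts_reviewed, articles_after_full_text, studies_included)
print(total_estimates, share_pct)
```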


Simple linear models 


Combining results from all four groups of men, SC declined significantly
(slope per year = -0.70 million/ml; 95% CI: -0.72 to -0.69; P <
0.001) over the study period when using simple linear models
(unadjusted, weighted by sample size) (Fig. 2a). SC declined by 0.75%
per year (95% CI: 0.73-0.77%) and overall by 28.5% between 1973
and 2011. A similar trend was seen for TSC (slope per year = -2.23
million; 95% CI: -2.31 to -2.16; P < 0.001) (Fig. 2b), corresponding
to a decline in TSC of 0.75% per year (95% CI: 0.72-0.78%), and
28.5% overall. Semen volume (156 estimates) did not change significantly
over the study period (slope per year = 0.0003 ml; 95% CI:
-0.0003 to 0.0008; P = 0.382).


Meta-regression models 


We ran meta-regression models, unadjusted and adjusted, with and 
without interaction terms for fertility and geographic groups 
(Supplementary Table S2). In the simple meta-regression model for 
SC, in which estimates were weighted by their SE but without covariate
adjustment, slopes were similar to those for simple regression,
but with wider CIs (SC slope = -0.68; -0.99 to -0.37; P < 0.001).
Figure 1 PRISMA flow chart showing the selection of studies eligible for meta-regression analysis.


Covariate adjustment did not appreciably alter the slope but widened
the CI further (-0.64; -1.06 to -0.22; P = 0.003).

Slopes were significantly modified by the interaction of time with both
fertility and geographic group. The three-way interaction term (time ×
fertility group × geographic group) was not significant (P = 0.57) and
was not included in final models. In the final adjusted models for SC,
which included two interaction terms [time × fertility group (P = 0.064)
and time × geographic group (P = 0.027)], significant declines were seen
among both Unselected Western (-1.38 million/ml/year, -2.02 to
-0.74; P < 0.001) and Fertile Western (-0.68, -1.31 to -0.05; P =
0.033) (Table 1, Fig. 3a), with a steeper slope for Unselected Western.
Using estimates from the fully adjusted model of 99.0 million/ml in 1973
to 47.1 million/ml in 2011, SC in the Unselected Western group
declined 1.4% per year and overall by 52.4% between 1973 and 2011.

In the final adjusted models for TSC (Table 1), which included time ×
fertility group (P = 0.014) and time × geographic group (P = 0.021), in
Western studies a steeper slope in TSC was seen among Unselected
(-5.33 million/year, -7.56 to -3.11; P < 0.001) versus Fertile (-2.12,
-4.31 to 0.07; P = 0.057) (Table 1, Fig. 3b). Using estimates from the
fully adjusted model of 337.5 million in 1973 to 137.5 million in 2011,
TSC in the Unselected Western group declined 1.6% per year and
overall by 59.3% between 1973 and 2011.

No significant trends in SC or TSC were seen in Other countries 
overall, or for Unselected or Fertile men separately. 
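
Because the final models contain two two-way interaction terms and no three-way term, the four group-specific slopes are linked additively. The sketch below illustrates this with the slopes reported in Table 1. It assumes Unselected Western is the reference category (a coding choice not stated in the text), back-calculates the two interaction coefficients from the reported slopes, and checks that they reproduce the Fertile Other slope. The function and coefficient names are hypothetical, and the derived coefficients are not values reported by the model.

```python
def interaction_terms(slopes):
    """Back-calculate (reference slope, time x fertile, time x other) from group slopes."""
    base = slopes[("Unselected", "Western")]
    time_x_fertile = slopes[("Fertile", "Western")] - base
    time_x_other = slopes[("Unselected", "Other")] - base
    return base, time_x_fertile, time_x_other


def group_slope(base, time_x_fertile, time_x_other, fertile, other):
    """Slope for a group coded by 0/1 indicators for Fertile and Other."""
    return base + time_x_fertile * fertile + time_x_other * other


# Slopes per calendar year as reported in Table 1.
SC_SLOPES = {   # million/ml/year
    ("Unselected", "Western"): -1.38,
    ("Fertile", "Western"): -0.68,
    ("Unselected", "Other"): -0.42,
    ("Fertile", "Other"): 0.28,
}
TSC_SLOPES = {  # million/year
    ("Unselected", "Western"): -5.33,
    ("Fertile", "Western"): -2.12,
    ("Unselected", "Other"): -1.88,
    ("Fertile", "Other"): 1.33,
}

sc_terms = interaction_terms(SC_SLOPES)
tsc_terms = interaction_terms(TSC_SLOPES)

# With no three-way term, Fertile Other = reference + both interaction terms.
sc_fertile_other = group_slope(*sc_terms, fertile=1, other=1)
tsc_fertile_other = group_slope(*tsc_terms, fertile=1, other=1)

print(round(sc_fertile_other, 2), SC_SLOPES[("Fertile", "Other")])
print(round(tsc_fertile_other, 2), TSC_SLOPES[("Fertile", "Other")])
```

The reconstructed Fertile Other slopes agree with the tabulated ones to two decimal places, which is what a model without a three-way interaction implies.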
Figure 2 (a) Mean sperm concentration by year of sample collection in 244 estimates collected in 1973-2011 and simple linear regression.
(b) Mean total sperm count by year of sample collection in 244 estimates collected in 1973-2011 and simple linear regression. 
Figure 3 (a) Meta-regression model for mean sperm concentration by fertility and geographic groups, adjusted for potential confounders.
(b) Meta-regression model for mean total sperm count by fertility and geographic groups, adjusted for potential confounders. Meta-regression model
weighted by sperm concentration (SC) SE, adjusted for fertility group, time × fertility group interaction, geographic group, time × geographic group
interaction, age, abstinence time, semen collection method reported, counting method reported, having more than one sample per man, indicators
for study selection of population and exclusion criteria (some vasectomy candidates, some semen donor candidates, exclusion of men with chronic
diseases, exclusion by other reasons not related to fertility, selection by occupation not related to fertility), whether year of collection was
estimated, whether arithmetic mean of SC was estimated, whether SE of SC was estimated and indicator variable to denote studies with more than
one estimate. Total sperm count (TSC) meta-regression models weighted by TSC SE, adjusted for similar covariates and method used to assess
semen volume.


Sensitivity analyses 


We performed multiple analyses to examine the sensitivity of results 
to assumptions about our model, influence of covariates, estimation 
of missing data, trends in SEs and study period. For the sake of brevity,
results from sensitivity analyses are presented here for the slope of
SC in the Unselected Western group. In all sensitivity analyses there was
a significant (P < 0.01) and strong (>1.0 million/ml/year) decline for
the Unselected Western group.


Table 1 Sperm concentration and total sperm count in first and last years of meta-regression analysis with percentage
change and slope per year, for all men and by fertility and geographic groups*.

Category            N (estimates)  First year  First year SC  Last year  Last year SC  Percentage   Slope (95% CI),
                                               (million/ml)              (million/ml)  change/year  million/ml/year
All men             244            1973        92.8           2011       66.4          -0.75        -0.70 (-0.72 to -0.69)
Unselected Western  110            1973        99.0           2011       47.1          -1.40        -1.38 (-2.02 to -0.74)
Fertile Western     65             1977        83.8           2009       62.0          -0.81        -0.68 (-1.31 to -0.05)
Unselected Other    30             1986        72.7           2010       62.6          -0.58        -0.42 (-1.24 to 0.40)
Fertile Other       39             1978        66.4           2011       75.7          0.42         0.28 (-0.44 to 1.00)

Category            N (estimates)  First year  First year TSC Last year  Last year TSC Percentage   Slope (95% CI),
                                               (million)                 (million)     change/year  million/year
All men             244            1973        295.7          2011       212.0         -0.75        -2.23 (-2.31 to -2.16)
Unselected Western  110            1973        337.5          2011       137.5         -1.58        -5.33 (-7.56 to -3.11)
Fertile Western     65             1977        277.4          2009       209.5         -0.76        -2.12 (-4.31 to 0.07)
Unselected Other    30             1986        212.4          2010       167.3         -0.88        -1.88 (-4.77 to 1.01)
Fertile Other       39             1978        189.2          2011       233.2         0.70         1.33 (-1.20 to 3.86)

*For all men: simple linear regression weighted by sample size. For all other categories: Meta-regression model weighted by sperm concentration (SC) SE, adjusted for fertility
group, time × fertility group interaction, geographic group, time × geographic group interaction, age, abstinence time, semen collection method reported, counting method
reported, having more than one sample per man, indicators for study selection of population and exclusion criteria (some vasectomy candidates, some semen donor candidates,
exclusion of men with chronic diseases, exclusion by other reasons not related to fertility, selection by occupation not related to fertility), whether year of collection was
estimated, whether arithmetic mean of SC was estimated, whether SE of SC was estimated and indicator variable to denote studies with more than one estimate. Total sperm count
(TSC) meta-regression models weighted by TSC SE, adjusted for similar covariates and method used to assess semen volume.


- Adding a quadratic or cubic function of year to meta-regression
models did not substantially change the shape of the trend or
improve model fit (as adjusted R-square declined), overall or within
any of the geographic or fertility groups (coefficient for the
quadratic term: 0.0009; 95% CI: -0.04 to 0.05, P = 0.969; for the
cubic term -0.0003; 95% CI: -0.0007 to 0.0007, P = 0.942).

- Results of sensitivity analyses excluding a specific group for each
covariate, or removing each covariate at a time from the model,
are in Supplementary Table S3.

- After excluding nine estimates with a SE of SC > 20 million/ml, the
slope for Unselected Western was -1.31 million/ml (-1.96 to
-0.66; P = 0.001).

- Excluding 85 studies with no data on mean age and adjusting for
mean age instead of age group yielded a slope of -1.68 million/ml
(-2.35 to -1.01; P < 0.001).

- The proportion of smokers was reported in only 25% of studies. To
examine this variable a sensitivity analysis including a covariate for
high proportion of smokers (>30%) was performed, and slopes
changed only slightly (-1.39 million/ml, -2.03 to -0.75; P < 0.001).

- The slope for Unselected Western did not change appreciably after
excluding each country/region with more than 10 estimates at a
time. Excluding 28 estimates from Australia and New Zealand, the
slope for studies of unselected men from North America/Europe
was -1.13 million/ml (-1.79 to -0.47; P = 0.001). Excluding
estimates from the USA (n = 39) or Denmark (n = 19) the slopes
were -1.46 million/ml (-2.25 to -0.67; P < 0.001) and -1.57
million/ml (-2.26 to -0.87; P < 0.001), respectively.

- Restricting the analysis to data from recent years (196 estimates
collected post-1985) the slope (-1.57 million/ml, -2.51 to -0.62;
P = 0.001) was similar to that for the full model. Restricting the
analysis to data post-1995 (model restricted to 53 estimates of
Unselected Western due to insufficient observations for interaction
terms) the slope (-2.06 million/ml, -3.38 to -0.74; P = 0.004)
was somewhat steeper.


Results for TSC slope were also robust in all sensitivity analyses.
Restricting the analysis to data post-1995 the slope (-8.12 million,
-13.73 to -2.51; P = 0.006) was somewhat steeper.


Discussion 


Key findings 

In this first systematic review and meta-regression analysis of tem- 
poral trends in sperm counts we report a significant overall decline in 
both SC and TSC in samples collected between 1973 and 2011. 
Declines were significant only in studies from North America, 
Europe, Australia (and New Zealand), where they were most pro- 
nounced among men unselected by fertility. In this latter group, SC 
declined 52.4% (-1.4% per year) and TSC 59.3% (-1.6% per year)
over the study period. These slopes remained substantially 
unchanged after controlling for multiple preselected covariates (age, 
abstinence time, method of semen collection, method of counting 
sperm, selection of population and study exclusion criteria, number 
of samples per man and completeness of data) and in multiple sensi- 
tivity analyses. Thus, these data provide robust indication for a 
decline in SC and TSC in North America, Europe, Australia and New 
Zealand over the last 4 decades. There was no sign of ‘leveling off’ of
the decline when analyses were restricted to studies with sample
collection in 1996-2011.


Comparison to previous studies 


The overall decline in SC reported here (-0.70 million/ml/year) was
consistent with, but not as steep as, the declines previously reported
for an earlier period (-0.93 and -0.94 million/ml/year) (Carlsen et al., 1992;
Swan et al., 1997, 2000) (Table II). The annual percentage change in
SC reported here was -0.75%, comparable to -0.83%
reported by Carlsen et al. (1992). As in prior analyses (Swan et al.,
1997, 2000), we saw no significant declines for studies from South
America, Asia and Africa, which may, in part, be accounted for by limited
statistical power and an absence of studies in unselected men
from these countries prior to 1985. However, we note that the 
modification of the slope by geographic group was significant. Thus, 
based on the results presented here, while it is not possible to rule 
out a trend in non-Western countries, these data do not support a 
decline as steep as that observed in Western countries. In the cur- 
rent analysis, declines in North America and Europe/Australia were 
similar, unlike prior analyses which included a higher proportion of 
studies from North America (Swan et al., 1997, 2000). 

Owing to the completeness of our search, our considerable sam- 
ple size across the entire study period and use of meta-regression 
methods, this analysis avoids many of the limitations of previous stud- 
ies. The study of Carlsen et al. (1992), which weighted studies by 
sample size, was criticized for having one study that included 30% of 
all subjects and for the paucity of data in the first 30 years of the ana- 
lysis (Olsen et al., 1995). The largest study in the current meta- 
regression analysis included only 5% of all subjects, sensitivity analyses 
demonstrated that no one country drove the overall trend, and stud- 
ies were well distributed over the 39 years of the study period and 
among 50 different countries. Furthermore, the meta-regression 
methods utilized in the current study addressed the issue of hetero- 
geneity in the reliability of study estimates by weighting of estimates 
by their SE. This conservative method inflates the CI and is
appropriate when the number of studies is sufficiently large, as it was
in our analysis (Baker and Jackson, 2010). In addition, we adjusted for
a predetermined set of covariates, as well as variables indicating data
completeness and study exclusion criteria, thus avoiding the main
pitfall in reaching reliable conclusions from meta-regression analyses
(Thompson and Higgins, 2002).
Our statistical power enabled us to assess modification by fertility
and geographic group. Modification by fertility group is especially
important since fertile men represent a selected population, while 
unselected men are more likely to be representative of the general 
population. 
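
To illustrate why weighting estimates by their SE matters, the following minimal sketch fits a straight line by inverse-variance weighted least squares. It is not the model used in this study: the weights 1/SE² are an assumption, the covariates and the between-study variance component of a full meta-regression are left out, and all data points are hypothetical.

```python
def weighted_slope(years, means, ses):
    """Slope of a line fitted by inverse-variance weighted least squares."""
    weights = [1.0 / se ** 2 for se in ses]  # assumed weighting: 1 / SE^2
    w_sum = sum(weights)
    x_bar = sum(w * x for w, x in zip(weights, years)) / w_sum
    y_bar = sum(w * y for w, y in zip(weights, means)) / w_sum
    sxy = sum(w * (x - x_bar) * (y - y_bar)
              for w, x, y in zip(weights, years, means))
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, years))
    return sxy / sxx


# Hypothetical estimates: four precise points on a line with slope -1.0
# million/ml/year, plus one imprecise estimate (SE = 20) far below the line.
years = [1975, 1985, 1995, 2005, 1980]
means = [100.0, 90.0, 80.0, 70.0, 60.0]
ses = [2.0, 2.0, 2.0, 2.0, 20.0]

weighted = weighted_slope(years, means, ses)
unweighted = weighted_slope(years, means, [1.0] * len(years))
print(f"weighted slope:   {weighted:.2f}")
print(f"unweighted slope: {unweighted:.2f}")
```

In this toy data set the imprecise estimate pulls the unweighted slope well away from -1.0, while the weighted fit stays close to it, which is the behaviour the weighting is meant to provide.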

Some researchers have criticized the use of sperm count estimates
from the past, arguing that greater measurement error would be
expected in historical studies. This is an unlikely explanation for the 
trend we report here for several reasons. First, unlike earlier analyses 
that included studies in which samples were collected as far back as 
1931, our analysis includes studies with samples collected only since 
1973. Even if measurements were less reliable in the past, this greater 
imprecision would produce greater uncertainty in earlier studies but 
not a change in slope. Further, since we weighted estimates by their 
SE, we avoided this hypothetical limitation. In addition, results were 
robust in sensitivity analyses that excluded studies in which SE was 
estimated, or very large. 

Chance is an unlikely explanation for our results, which were signifi- 
cant even in the more conservative meta-regression models. We used 
written protocols and extensive quality control procedures to minimize 
potential information and selection bias in all steps of the study. 


Limitations 


There are several possible limitations to this systematic review and 
meta-regression analysis. It is possible that failure to include non- 
English publications may have limited our analyses of non-Western 
countries. It has been claimed that men who are willing to provide a
semen sample may differ from the rest of the population, leading to
potential selection bias, but current evidence does not support this
claim (Cooper et al., 2010). 

We analyzed sperm counts (both by SC and TSC) but not sperm
motility and morphology because information regarding motility and
morphology was seldom available in older studies. Moreover, the
recommended methods and criteria for motility and morphology
assessments have changed significantly over time, making across-time
comparisons difficult. In contrast, the assessment of SC by
hemocytometer, first described in 1902 (Benedict, 1902), has been the
method recommended by the World Health Organization since 1980
(World Health Organization, 2010), and there is no evidence that
this method has varied systematically over time. For these reasons
SC is considered to be the most reliable endpoint for epidemiological
analysis (Le Moal et al., 2016). Because of this stability and the
variability of other counting methods over time we only included studies
in which counting was done (or likely done) by hemocytometer and
excluded studies that used alternative counting chambers (e.g. Makler,
Coulter and Microcell) or non-manual methods (i.e. computer assisted
sperm analysis or flow cytometry). Even though we followed a detailed
protocol, this study was not preregistered in PROSPERO.


Table II Characteristics and results of fitting a simple linear regression model (without adjustment, weighted by
sample size) for trends of sperm concentration in the current study, in Carlsen et al. (1992), and in Swan et al. (2000).

                            Levine et al. (2017,    Carlsen et al.    Swan et al.
                            current study)          (1992)            (2000)

Publication years           1981-2013               1938-1990         1934-1996
Number of studies           185 (244 estimates)     61                101
Number of countries         50                      20                28
Fertility group: N (%)
  Fertile                   104 (43%)               39 (64%)          51 (50%)*
  Unselected                140 (57%)               22 (36%)          50 (50%)
Geographic group: N (%)
  Western†                  175 (72%)               45 (74%)          78 (77%)
  Other                     69 (28%)                16 (26%)          23 (23%)
Slope (million/ml/year)     -0.70                   -0.93             -0.94
P-value                     <0.001                  <0.001            <0.001

*Wife pregnant or post-partum or at least 90% of men with proven fertility.
†Western includes studies from North America, Europe and Australia (and New Zealand). Other includes studies from all other countries.

Analysing trends by birth cohorts instead of year of sample collec- 
tion may aid in assessing the causes of the decline (prenatal or in 
adult life) but was not feasible owing to lack of information. 


Wider implications 

This rigorous and comprehensive analysis finds that SC declined 
52.4% between 1973 and 2011 among unselected men from
Western countries, with no evidence of a ‘leveling off’ in recent
years. Declining mean SC implies that an increasing proportion of
men have sperm counts below any given threshold for sub-fertility or
infertility. The high proportion of men from Western countries with
concentration below 40 million/ml is particularly concerning given the 
evidence that SC below this threshold is associated with a decreased 
monthly probability of conception (Bonde et al., 1998). 
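
The link between a falling average and a growing share of men below a fixed threshold can be sketched numerically. The example below assumes, purely for illustration, that SC follows a log-normal distribution whose spread stays constant while its median falls; the two median values and the log-scale standard deviation are hypothetical and are not estimates from this study. Only the 40 million/ml threshold comes from the text.

```python
from math import log
from statistics import NormalDist

THRESHOLD = 40.0  # million/ml, the threshold cited in the text
LOG_SD = 0.8      # hypothetical spread on the natural-log scale


def fraction_below(median_sc, threshold=THRESHOLD, log_sd=LOG_SD):
    """Share of men below `threshold` if SC is log-normal with this median."""
    return NormalDist(mu=log(median_sc), sigma=log_sd).cdf(log(threshold))


earlier = fraction_below(90.0)  # hypothetical earlier median, million/ml
later = fraction_below(45.0)    # hypothetical later median, million/ml
print(f"{earlier:.0%} -> {later:.0%} below {THRESHOLD:g} million/ml")
```

Under these assumptions, any downward shift of the distribution raises the fraction below the threshold; the size of the increase depends on the assumed shape and spread.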

Declines in sperm count have implications beyond fertility and 
reproduction. The decline we report here is consistent with reported 
trends in other male reproductive health indicators, such as testicular 
germ cell tumors, cryptorchidism, onset of male puberty and total 
testosterone levels (Skakkebaek et al., 2016). The public health impli- 
cations are even wider. Recent studies have shown that poor sperm 
count is associated with overall morbidity and mortality (Jensen et al.,
2009; Eisenberg et al., 2014b, 2016; Latif et al., 2017). While the cur- 
rent study is not designed to provide direct information on the causes 
of the observed declines, sperm count has been plausibly associated 
with multiple environmental and lifestyle influences, both prenatally 
and in adult life. In particular, endocrine disruption from chemical 
exposures or maternal smoking during critical windows of male 
reproductive development may play a role in prenatal life, while life- 
style changes and exposure to pesticides may play a role in adult life. 
Thus, a decline in sperm count might be considered as a ‘canary in 
the coal mine’ for male health across the lifespan. Our report of a 
continuing and robust decline should, therefore, trigger research into 
its causes, aiming for prevention. 


Conclusion 


In this comprehensive meta-analysis, sperm counts, whether measured
by SC or TSC, declined significantly among men from North
America, Europe and Australia during 1973-2011, with a 50-60%
decline among men unselected by fertility and no evidence of a
‘leveling off’ in recent years. These findings strongly suggest a significant
decline in male reproductive health, which has serious implications
beyond fertility concerns. Research on causes and implications of this
decline is urgently needed.


Supplementary data 


Supplementary data are available at Human Reproduction Update online. 